Convexity Certificates from Hessians (Supplementary Material)

Neural Information Processing Systems

Here, we (1) provide the grammar for the formal language of mathematical expressions to which our certification algorithm is applied, (2) give more algorithmic details about our implementation of the Hessian approach, (3) show that our implementation of the Hessian approach can also certify the remaining differentiable CVX atoms with vector input, which we could not discuss in the main paper because of space constraints, and (4) provide more examples of differentiable functions that can be certified by the Hessian approach but are missing from CVX's DCP implementation. The formal language of mathematical expressions to which our certification algorithm is applied is specified by the grammar depicted in Figure 1. The language is rich enough to cover all the examples in the main paper and this supplement. In this grammar, number is a placeholder for an arbitrary floating-point number, variable is a placeholder for variable names starting with a Latin character, and function is a placeholder for the supported elementary differentiable functions such as exp, log, and sum. Our implementation of the Hessian approach works on vectorized and normalized expression DAGs (directed acyclic graphs) for Hessians, which contain every subexpression exactly once.
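The idea of certifying convexity from the Hessian can be illustrated, in a much-simplified univariate form, with SymPy: a twice-differentiable function is convex if its second derivative is provably nonnegative. This sketch is not the paper's vectorized DAG algorithm; the function name and the restriction to one variable are our own simplifications.

```python
import sympy as sp

def convexity_certificate_1d(f, x):
    """Certify convexity of a univariate expression by proving f'' >= 0.

    Returns True only when SymPy can prove nonnegativity of the second
    derivative under the declared assumptions; None means "unknown",
    mirroring the conservative, sound-but-incomplete nature of certification.
    """
    hess = sp.simplify(sp.diff(f, x, 2))
    return hess.is_nonnegative

x = sp.symbols('x', real=True)
print(convexity_certificate_1d(sp.exp(x) + x**2, x))  # True: f'' = exp(x) + 2 > 0
print(convexity_certificate_1d(x**3, x))              # not True: f'' = 6x changes sign
```

Like the paper's approach, this only ever answers "certified convex" or "don't know": a `None` result does not prove non-convexity.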




Symbolic Regression with Multimodal Large Language Models and Kolmogorov Arnold Networks

Harvey, Thomas R., Ruehle, Fabian, Fraser-Taliente, Kit, Halverson, James

arXiv.org Artificial Intelligence

We present a novel approach to symbolic regression using vision-capable large language models (LLMs) and the ideas behind Google DeepMind's FunSearch. The LLM is given a plot of a univariate function and tasked with proposing an ansatz for that function. The free parameters of the ansatz are fitted using standard numerical optimisers, and a collection of such ansätze make up the population of a genetic algorithm. Unlike other symbolic regression techniques, our method does not require the specification of a set of functions to be used in regression, but with appropriate prompt engineering, we can arbitrarily condition the generative step. By using Kolmogorov-Arnold Networks (KANs), we demonstrate that "univariate is all you need" for symbolic regression, and extend this method to multivariate functions by learning the univariate function on each edge of a trained KAN. The combined expression is then simplified by further processing with a language model.
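The numerical half of this pipeline, fitting the free parameters of a proposed ansatz, can be sketched with a standard optimiser. The LLM-proposal and genetic-algorithm steps are out of scope here; the ansatz, the synthetic data, and the initial guess below are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical ansatz an LLM might propose after seeing a decaying curve:
# f(x) = a * exp(-b * x) + c, with free parameters (a, b, c).
def ansatz(x, a, b, c):
    return a * np.exp(-b * x) + c

# Synthetic stand-in for the plotted data the model would have been shown,
# generated with a = 2.0, b = 0.5, c = 1.0 plus a little noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = 2.0 * np.exp(-0.5 * x) + 1.0 + rng.normal(0.0, 0.01, x.size)

# Fit the free parameters with a standard numerical optimiser.
params, _ = curve_fit(ansatz, x, y, p0=[1.0, 1.0, 0.0])

# The fit residual is the kind of score a genetic algorithm could rank
# competing ansätze by.
mse = np.mean((ansatz(x, *params) - y) ** 2)
```

In the full method, many such fitted ansätze would form the population, with the MSE (or similar) serving as the fitness signal.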


Addressing common misinterpretations of KART and UAT in neural network literature

Ismailov, Vugar

arXiv.org Artificial Intelligence

This note addresses the Kolmogorov-Arnold Representation Theorem (KART) and the Universal Approximation Theorem (UAT), focusing on their common misinterpretations in some papers related to neural network approximation. Our remarks aim to support a more accurate understanding of KART and UAT among neural network specialists.


Functional Tensor Decompositions for Physics-Informed Neural Networks

Vemuri, Sai Karthikeya, Büchner, Tim, Niebling, Julia, Denzler, Joachim

arXiv.org Artificial Intelligence

Physics-Informed Neural Networks (PINNs) have shown continuous and increasing promise in approximating partial differential equations (PDEs), although they remain constrained by the curse of dimensionality. In this paper, we propose a generalized PINN version of the classical variable separable method. To do this, we first show that, using the universal approximation theorem, a multivariate function can be approximated by the outer product of neural networks whose inputs are separated variables. We leverage tensor decomposition forms to separate the variables in a PINN setting. By employing Canonical Polyadic (CP), Tensor-Train (TT), and Tucker decomposition forms within the PINN framework, we create robust architectures for learning multivariate functions from separate neural networks connected by outer products. Our methodology significantly enhances the performance of PINNs, as evidenced by improved results on complex high-dimensional PDEs, including the 3D Helmholtz and 5D Poisson equations, among others. This research underscores the potential of tensor decomposition-based variable-separated PINNs to surpass the state-of-the-art, offering a compelling solution to the dimensionality challenge in PDE approximation.
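The separable structure behind this approach can be shown without any training: a function of several variables is approximated as a sum of outer products of univariate factors. In the two-variable case this is just a low-rank matrix factorization, so a truncated SVD can stand in for the univariate neural networks the paper learns. The example function and grid below are illustrative assumptions.

```python
import numpy as np

# f(x, y) = sin(x + y) is exactly rank-2 separable:
# sin(x + y) = sin(x)cos(y) + cos(x)sin(y), a sum of two outer products.
x = np.linspace(0.0, np.pi, 50)
y = np.linspace(0.0, np.pi, 60)
F = np.sin(x[:, None] + y[None, :])   # function sampled on a tensor-product grid

# A truncated SVD plays the role of the CP factors u_r(x), v_r(y)
# that the paper would represent with separate univariate networks.
U, s, Vt = np.linalg.svd(F)
F2 = (U[:, :2] * s[:2]) @ Vt[:2]      # rank-2 separable reconstruction

err = np.max(np.abs(F - F2))          # near machine precision for this f
```

In higher dimensions the matrix factorization generalizes to the CP, TT, and Tucker forms named in the abstract, with one learned univariate network per factor.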


Rethinking the Function of Neurons in KANs

Altarabichi, Mohammed Ghaith

arXiv.org Artificial Intelligence

The neurons of Kolmogorov-Arnold Networks (KANs) perform a simple summation motivated by the Kolmogorov-Arnold representation theorem, which asserts that the sum is the only fundamental multivariate function. In this work, we investigate the potential for identifying an alternative multivariate function for KAN neurons that may offer increased practical utility. Our empirical research involves testing various multivariate functions in KAN neurons across a range of benchmark Machine Learning tasks. Our findings indicate that substituting the sum with the average function in KAN neurons results in significant performance enhancements compared to traditional KANs. Our study demonstrates that this minor modification contributes to the stability of training by confining the input to the spline within the effective range of the activation function. Our implementation and experiments are available at: https://github.com/Ghaith81/dropkan
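The range argument behind the sum-to-average swap is easy to check numerically: summing many bounded edge activations produces values whose spread grows with the fan-in, while averaging keeps the neuron output inside the original interval, where a spline grid would typically be defined. The fan-in, interval, and sample count below are illustrative assumptions, not the paper's experimental settings.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs = 64
# Simulated edge activations landing in [-1, 1], the kind of range a
# spline grid might be defined on.
acts = rng.uniform(-1.0, 1.0, size=(10_000, n_inputs))

neuron_sum = acts.sum(axis=1)    # standard KAN neuron
neuron_avg = acts.mean(axis=1)   # the proposed modification

# The sum's spread grows like sqrt(n_inputs), pushing values well outside
# [-1, 1]; the average stays inside that interval by construction.
print(np.abs(neuron_sum).max())
print(np.abs(neuron_avg).max())
```

With 64 inputs the summed outputs routinely exceed the [-1, 1] interval by a large factor, illustrating why the next layer's splines would be evaluated far outside their effective range without the averaging.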


A Multi-resolution Low-rank Tensor Decomposition

Rozada, Sergio, Marques, Antonio G.

arXiv.org Artificial Intelligence

The (efficient and parsimonious) decomposition of higher-order tensors is a fundamental problem with numerous applications in a variety of fields. Several methods have been proposed in the literature to that end, with the Tucker and PARAFAC decompositions being the most prominent ones. Inspired by the latter, in this work we propose a multi-resolution low-rank tensor decomposition to describe (approximate) a tensor in a hierarchical fashion. The central idea of the decomposition is to recast the tensor into multiple lower-dimensional tensors to exploit the structure at different levels of resolution.

The PARAFAC decomposition is conceptually simple and its representation complexity scales gracefully (the number of parameters grows linearly with the rank). The Tucker decomposition enjoys additional degrees of freedom at the cost of greater complexity (exponential dependence of the number of parameters with respect to the rank). Hierarchical tensor decompositions, such as the Tensor Train (TT) decomposition [8] or a hierarchical Tucker (hTucker) decomposition [9], try to alleviate this problem. The former unwraps the tensor into a chain of three-dimensional tensors, and the latter ...
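The scaling contrast drawn here can be made concrete with standard parameter-count formulas, assuming for simplicity an order-d tensor with equal mode size n and a single rank parameter r throughout (a simplification of the rank tuples these decompositions actually carry):

```python
# Parameter counts for an order-d tensor with mode size n and rank r:
#   PARAFAC/CP : d * n * r           (linear in rank and order)
#   Tucker     : d * n * r + r**d    (core tensor grows exponentially with order)
#   TT         : ~ d * n * r**2     (chain of 3-way cores; boundary cores are matrices)
def cp_params(d, n, r):
    return d * n * r

def tucker_params(d, n, r):
    return d * n * r + r ** d

def tt_params(d, n, r):
    # two boundary cores of size n*r, (d - 2) interior cores of size r*n*r
    return 2 * n * r + (d - 2) * n * r * r

for d in (3, 6):
    print(d, cp_params(d, 16, 8), tucker_params(d, 16, 8), tt_params(d, 16, 8))
```

Already at order 6 with mode size 16 and rank 8, the Tucker core dwarfs the factor matrices, while CP and TT stay small, which is the "alleviate this problem" point made about hierarchical decompositions above.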


On the Kolmogorov neural networks

Ismayilova, Aysu, Ismailov, Vugar

arXiv.org Machine Learning

In this paper, we show that the Kolmogorov two-hidden-layer neural network model, with a continuous, discontinuous bounded, or unbounded activation function in the second hidden layer, can precisely represent continuous, discontinuous bounded, and all unbounded multivariate functions, respectively.